While browsing Taginfo I got curious how many elements have at least one key described on the Wiki and how big share of all keys the described ones make up. Therefore, I decided to check it out.
The analysis consisted of the following parts:
- fetching OSM database dump from planet.osm.org;
- fetching key statistics from Taginfo with the API;
- extracting “is in Wiki” info into separate file;
- altering “is in Wiki” info for keys which were described on the Wiki after the database was dumped. The alteration was based on the recent changes registry;
- processing the dump with DuckDB:
- extracting element type, its ID, and its tags to new table:
CREATE TABLE elements AS SELECT kind, id, tags FROM ST_READOSM('planet-latest.osm.pbf');; - exploding keys to separate records:
CREATE TABLE elements_keys AS SELECT kind, id, UNNEST(map_keys(tags)) FROM elements;;
- extracting element type, its ID, and its tags to new table:
- querying the database.
These are queries I provided to DuckDB:
| Result | Query |
|---|---|
| number of all elements | SELECT COUNT(*) FROM elements; |
| number of tagged elements | SELECT COUNT(*) FROM elements WHERE tags IS NOT NULL; |
| number of elements with key(s) described on the Wiki | SELECT COUNT(*) FROM (SELECT DISTINCT kind, id FROM elements_keys WHERE "key" IN (SELECT "key" FROM 'keys_wiki.csv' WHERE in_wiki)); |
| number of all keys | SELECT COUNT(*) FROM (SELECT DISTINCT "key" FROM elements_keys); |
| number of keys described on the Wiki | SELECT COUNT(*) from (SELECT DISTINCT "key" FROM elements_keys WHERE "key" IN (SELECT "key" FROM 'keys_wiki.csv' WHERE in_wiki)); |
I got the following results:


